Sentence Compression as a Step in Summarization or an Alternative Path in Text Shortening

نویسندگان

  • Mehdi Yousfi Monod
  • Violaine Prince
چکیده

The originality of this work leads in tackling text compression using an unsupervised method, based on a deep linguistic analysis, and without resorting on a learning corpus. This work presents a system for dependent tree pruning, while preserving the syntactic coherence and the main informational contents, and led to an operational software, named COLIN. Experiment results show that our compressions get honorable satisfaction levels, with a mean compression ratio of 38 %.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

متن کامل

Summarization with a Joint Model for Sentence Extraction and Compression

Text summarization is one of the oldest problems in natural language processing. Popular approaches rely on extracting relevant sentences from the original documents. As a side effect, sentences that are too long but partly relevant are doomed to either not appear in the final summary, or prevent inclusion of other relevant sentences. Sentence compression is a recent framework that aims to sele...

متن کامل

NAACL HLT 2009 Integer Linear Programming for Natural Language Processing

Text summarization is one of the oldest problems in natural language processing. Popular approaches rely on extracting relevant sentences from the original documents. As a side effect, sentences that are too long but partly relevant are doomed to either not appear in the final summary, or prevent inclusion of other relevant sentences. Sentence compression is a recent framework that aims to sele...

متن کامل

بهبود خلاصه سازی خودکار متون فارسی با استفاده از روش‌های پردازش زبان طبیعی و گراف شباهت

A significant amount of available information is stored in textual databases which contains a large collection of documents from different sources (such as news, articles, books, emails and web pages). The increasing visibility and importance of this class of information motivates us to work on having better automatic evaluation tools for textual resources. The automatic summarization of tex...

متن کامل

Automatic Text Summarization Using Two-Step Sentence Extraction

Automatic text summarization sets the goal at reducing the size of a document while preserving its content. Our summarization system is based on Two-step Sentence Extraction. As it combines statistical methods and reduces noise data through two steps efficiently, it can achieve high performance. In our experiments for 30% compression and 10% compression, our method is compared with Title, Locat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008